Overview

Dataset statistics

Number of variables: 14
Number of observations: 14776615
Missing cells: 7326379
Missing cells (%): 3.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.1 GiB
Average record size in memory: 295.0 B
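
The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that total memory is roughly the average record size times the row count. The variable names are illustrative and not part of the report.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 14
n_observations = 14_776_615
missing_cells = 7_326_379
avg_record_bytes = 295.0

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate in-memory size, converted from bytes to GiB (2**30 bytes).
total_gib = avg_record_bytes * n_observations / 2**30

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size:    {total_gib:.1f} GiB")
```

Under these assumptions both values round to the figures the report shows (3.5% and 4.1 GiB).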

Variable types

Text: 2
Categorical: 2
Numeric: 8
DateTime: 2

Alerts

High correlation: dropoff_latitude is highly overall correlated with pickup_latitude and 1 other field
High correlation: passenger_count is highly overall correlated with store_and_fwd_flag
High correlation: pickup_latitude is highly overall correlated with dropoff_latitude and 2 other fields
High correlation: pickup_longitude is highly overall correlated with pickup_latitude
High correlation: store_and_fwd_flag is highly overall correlated with dropoff_latitude and 3 other fields
High correlation: trip_distance is highly overall correlated with trip_time_in_secs
High correlation: trip_time_in_secs is highly overall correlated with trip_distance
High correlation: vendor_id is highly overall correlated with store_and_fwd_flag
Imbalance: store_and_fwd_flag is highly imbalanced (84.7%)
Missing: store_and_fwd_flag has 7326207 (49.6%) missing values
Skewed: rate_code is highly skewed (γ1 = 195.533726)
Skewed: pickup_latitude is highly skewed (γ1 = -127.8099616)
Skewed: dropoff_latitude is highly skewed (γ1 = -141.4446314)
Zeros: pickup_longitude has 267494 (1.8%) zeros
Zeros: pickup_latitude has 265104 (1.8%) zeros
Zeros: dropoff_longitude has 275657 (1.9%) zeros
Zeros: dropoff_latitude has 273357 (1.8%) zeros
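
The Zeros and Skewed alerts on the coordinate columns suggest placeholder and out-of-range values. One way to screen them is sketched below. `is_plausible_coordinate` is a hypothetical helper, and treating exact zeros as placeholders is an assumption rather than something the report states. The out-of-range examples pair a reported extreme with the other column's median, purely for illustration.

```python
def is_plausible_coordinate(longitude, latitude):
    """Return True if the pair could be a real geographic position."""
    if longitude is None or latitude is None:
        return False
    # Assumption: an exact zero is a placeholder, not a real position.
    if longitude == 0 or latitude == 0:
        return False
    # Valid geographic ranges for longitude and latitude in degrees.
    return -180.0 <= longitude <= 180.0 and -90.0 <= latitude <= 90.0


examples = [
    (-73.978165, 40.757977),  # first sample row
    (0.0, 0.0),               # the kind of value behind the Zeros alerts
    (-2771.2854, 40.753147),  # reported pickup_longitude minimum
    (-73.981659, 3310.3645),  # reported pickup_latitude maximum
]
flags = [is_plausible_coordinate(lon, lat) for lon, lat in examples]
print(flags)
```

A tighter bounding box around the service area would also catch sign-flipped longitudes such as 73.988716, which this range check lets through.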

Reproduction

Analysis started: 2025-10-28 01:06:34.028097
Analysis finished: 2025-10-28 01:17:05.672391
Duration: 10 minutes and 31.64 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json

Variables

medallion
Text

Distinct: 13426
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 GiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 472851680
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): < 0.1%

Sample

1st row: 89D227B655E5C82AECF13C3F540D4CF4
2nd row: 0BD7C8F5BA12B88E0B67BED28BEA73D8
3rd row: 0BD7C8F5BA12B88E0B67BED28BEA73D8
4th row: DFD2202EE08F7A8DC9A57B02ACB81FE2
5th row: DFD2202EE08F7A8DC9A57B02ACB81FE2

Value                             Count      Frequency (%)
7e1346f23960cc18d7d129fa28b63a75  2137       < 0.1%
6ffcf7a4f34ba44239636028e680e438  2112       < 0.1%
a979cda04cfb8ba3d3acba7e8d7f0661  2039       < 0.1%
d5c7cd37ea4d372d00f0a681cdc93f11  1959       < 0.1%
849e486825860106403fb991a763bcc3  1957       < 0.1%
6fe6dff9a59c0b64be0ca64ee2699f08  1941       < 0.1%
06c961ebe7ef4d13f3ae6c005ee0f501  1893       < 0.1%
22908753e00888cc219c875c8d5bc4f6  1886       < 0.1%
e6101a0f85312c49a5b5950e61d284dc  1882       < 0.1%
6403bf98e4618e21c795c3b45a636d77  1882       < 0.1%
Other values (13416)              14756927   99.9%

Most occurring characters

Value             Count      Frequency (%)
E                 29839347   6.3%
4                 29833374   6.3%
A                 29731461   6.3%
9                 29695897   6.3%
5                 29640335   6.3%
F                 29609660   6.3%
D                 29606522   6.3%
7                 29545983   6.2%
6                 29539893   6.2%
2                 29522375   6.2%
Other values (6)  176286833  37.3%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  472851680  100.0%

Most frequent character per category

(unknown): identical to the "Most occurring characters" table above.

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  472851680  100.0%

Most frequent character per script

(unknown): identical to the "Most occurring characters" table above.

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  472851680  100.0%

Most frequent character per block

(unknown): identical to the "Most occurring characters" table above.

hack_license
Text

Distinct: 32224
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 GiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 472851680
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 182
Unique (%): < 0.1%

Sample

1st row: BA96DE419E711691B9445D6A6307C170
2nd row: 9FD8F69F0804BDB5549F40E9DA1BE472
3rd row: 9FD8F69F0804BDB5549F40E9DA1BE472
4th row: 51EE87E3205C985EF8431D850C786310
5th row: 51EE87E3205C985EF8431D850C786310

Value                             Count      Frequency (%)
00b7691d86d96aebd21dd9e138f90840  1933       < 0.1%
f49fd0d84449ae7f72f3bc492cd6c754  1616       < 0.1%
51c1be97280a80ebfa8dad34e1956cf6  1603       < 0.1%
847349f8845a667d9ac7cdedd1c873cb  1570       < 0.1%
ce625fd96d0fafc812a6957139b354a1  1557       < 0.1%
3d757e111c78f5cac83d44a92885d490  1514       < 0.1%
22ca618759c716436ea3257480199a32  1501       < 0.1%
3aab94ca53fe93a64811f65690654649  1486       < 0.1%
e66e58207128619cff2d2e2c3c7ecc08  1442       < 0.1%
c9674190984ba193ffd8ddcc019804cf  1390       < 0.1%
Other values (32214)              14761003   99.9%

Most occurring characters

Value             Count      Frequency (%)
C                 29776443   6.3%
E                 29713235   6.3%
8                 29655194   6.3%
5                 29608188   6.3%
0                 29603323   6.3%
D                 29589934   6.3%
3                 29585835   6.3%
7                 29576279   6.3%
F                 29553071   6.2%
B                 29539186   6.2%
Other values (6)  176650992  37.4%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  472851680  100.0%

Most frequent character per category

(unknown): identical to the "Most occurring characters" table above.

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  472851680  100.0%

Most frequent character per script

(unknown): identical to the "Most occurring characters" table above.

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  472851680  100.0%

Most frequent character per block

(unknown): identical to the "Most occurring characters" table above.

vendor_id
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 732.8 MiB
CMT: 7450899
VTS: 7325716

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 44329845
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CMT
2nd row: CMT
3rd row: CMT
4th row: CMT
5th row: CMT

Common Values

Value  Count    Frequency (%)
CMT    7450899  50.4%
VTS    7325716  49.6%

Length

[Histogram of lengths of the category; image not preserved]

Common Values (Plot)

[Plot of common values; image not preserved]

Value  Count    Frequency (%)
cmt    7450899  50.4%
vts    7325716  49.6%

Most occurring characters

Value  Count     Frequency (%)
T      14776615  33.3%
C      7450899   16.8%
M      7450899   16.8%
V      7325716   16.5%
S      7325716   16.5%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  44329845  100.0%

Most frequent character per category

(unknown): identical to the "Most occurring characters" table above.

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  44329845  100.0%

Most frequent character per script

(unknown): identical to the "Most occurring characters" table above.

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  44329845  100.0%

Most frequent character per block

(unknown): identical to the "Most occurring characters" table above.

rate_code
Real number (ℝ)

Skewed 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0342732
Minimum: 0
Maximum: 210
Zeros: 667
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 112.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 1
Maximum: 210
Range: 210
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.33877148
Coefficient of variation (CV): 0.32754545
Kurtosis: 113260.8
Mean: 1.0342732
Median Absolute Deviation (MAD): 0
Skewness: 195.53373
Sum: 15283057
Variance: 0.11476612
Monotonicity: Not monotonic
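
Several of these descriptive statistics are simple functions of one another. The sketch below recomputes the coefficient of variation and the variance from the reported mean and standard deviation, and shows how a few extreme codes can drive skewness up. It assumes the plain population formula g1 = m3 / m2^1.5; the report may apply a small-sample correction, which is negligible at this row count. The `toy` list is invented for illustration and is not the real column.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((v - center) ** 2 for v in values) / n
    m3 = sum((v - center) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Reported figures for rate_code.
mean, std = 1.0342732, 0.33877148
cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation

# Toy sample (not the real data): mostly 1s plus one extreme code.
toy = [1] * 97 + [2, 2, 210]
print(round(cv, 6), round(variance, 6), skewness(toy) > 0)
```

A single value of 210 among codes that are almost all 1 is enough to make the toy sample strongly right-skewed, which is the pattern the Skewed alert points at.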
[Histogram with fixed size bins (bins=14); image not preserved]

Common values

Value                 Count      Frequency (%)
1                     14456067   97.8%
2                     239160     1.6%
5                     39889      0.3%
4                     22831      0.2%
3                     17655      0.1%
0                     667        < 0.1%
6                     315        < 0.1%
210                   11         < 0.1%
8                     10         < 0.1%
128                   4          < 0.1%
Other values (4)      6          < 0.1%

Minimum 10 values

Value                 Count      Frequency (%)
0                     667        < 0.1%
1                     14456067   97.8%
2                     239160     1.6%
3                     17655      0.1%
4                     22831      0.2%
5                     39889      0.3%
6                     315        < 0.1%
7                     2          < 0.1%
8                     10         < 0.1%
9                     1          < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
210                   11         < 0.1%
128                   4          < 0.1%
65                    1          < 0.1%
28                    2          < 0.1%
9                     1          < 0.1%
8                     10         < 0.1%
7                     2          < 0.1%
6                     315        < 0.1%
5                     39889      0.3%
4                     22831      0.2%

store_and_fwd_flag
Categorical

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 7326207
Missing (%): 49.6%
Memory size: 14.1 MiB
N: 7285231
Y: 165177

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 7450408
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: N
2nd row: N
3rd row: N
4th row: N
5th row: N

Common Values

Value      Count    Frequency (%)
N          7285231  49.3%
Y          165177   1.1%
(Missing)  7326207  49.6%
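
The 84.7% imbalance figure in the alerts can be reproduced from the two non-missing counts above. The sketch assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. That definition matches the reported number, but the report itself does not spell it out, and `imbalance_score` is a hypothetical name.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 means balanced, 1 means one class dominates fully."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Undefined for a single class; 0.0 is an arbitrary convention here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Non-missing counts for store_and_fwd_flag: N and Y.
score = imbalance_score([7_285_231, 165_177])
print(f"{100 * score:.1f}%")
```

Missing values are left out of the calculation, which is why the 97.8% / 2.2% split between N and Y is the relevant distribution rather than the 49.3% / 1.1% / 49.6% split in the table.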

Length

[Histogram of lengths of the category; image not preserved]

Common Values (Plot)

[Plot of common values; image not preserved]

Value  Count    Frequency (%)
n      7285231  97.8%
y      165177   2.2%

Most occurring characters

Value  Count    Frequency (%)
N      7285231  97.8%
Y      165177   2.2%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  7450408  100.0%

Most frequent character per category

(unknown): identical to the "Most occurring characters" table above.

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  7450408  100.0%

Most frequent character per script

(unknown): identical to the "Most occurring characters" table above.

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  7450408  100.0%

Most frequent character per block

(unknown): identical to the "Most occurring characters" table above.

pickup_datetime
DateTime

Distinct: 2303465
Distinct (%): 15.6%
Missing: 0
Missing (%): 0.0%
Memory size: 112.7 MiB
Minimum: 2013-01-01 00:00:00
Maximum: 2013-01-31 23:59:59
Invalid dates: 0
Invalid dates (%): 0.0%
[Histogram with fixed size bins (bins=50); image not preserved]

dropoff_datetime
DateTime

Distinct: 2305816
Distinct (%): 15.6%
Missing: 0
Missing (%): 0.0%
Memory size: 112.7 MiB
Minimum: 2013-01-01 00:00:36
Maximum: 2013-02-01 10:33:08
Invalid dates: 0
Invalid dates (%): 0.0%
[Histogram with fixed size bins (bins=50); image not preserved]

passenger_count
Real number (ℝ)

High correlation 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6973721
Minimum: 0
Maximum: 255
Zeros: 166
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 112.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 255
Range: 255
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.3653958
Coefficient of variation (CV): 0.80441749
Kurtosis: 118.26646
Mean: 1.6973721
Median Absolute Deviation (MAD): 0
Skewness: 2.6812626
Sum: 25081414
Variance: 1.8643057
Monotonicity: Not monotonic
[Histogram with fixed size bins (bins=10); image not preserved]

Common values

Value                 Count      Frequency (%)
1                     10471701   70.9%
2                     1986196    13.4%
5                     920006     6.2%
3                     597485     4.0%
6                     520066     3.5%
4                     280992     1.9%
0                     166        < 0.1%
208                   1          < 0.1%
9                     1          < 0.1%
255                   1          < 0.1%

Minimum 10 values

Value                 Count      Frequency (%)
0                     166        < 0.1%
1                     10471701   70.9%
2                     1986196    13.4%
3                     597485     4.0%
4                     280992     1.9%
5                     920006     6.2%
6                     520066     3.5%
9                     1          < 0.1%
208                   1          < 0.1%
255                   1          < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
255                   1          < 0.1%
208                   1          < 0.1%
9                     1          < 0.1%
6                     520066     3.5%
5                     920006     6.2%
4                     280992     1.9%
3                     597485     4.0%
2                     1986196    13.4%
1                     10471701   70.9%
0                     166        < 0.1%

trip_time_in_secs
Real number (ℝ)

High correlation 

Distinct: 6594
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 683.42359
Minimum: 0
Maximum: 10800
Zeros: 34185
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 112.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 177
Q1: 360
Median: 554
Q3: 885
95-th percentile: 1614
Maximum: 10800
Range: 10800
Interquartile range (IQR): 525

Descriptive statistics

Standard deviation: 494.40626
Coefficient of variation (CV): 0.7234258
Kurtosis: 10.977518
Mean: 683.42359
Median Absolute Deviation (MAD): 252
Skewness: 2.2749304
Sum: 1.0098687 × 10^10
Variance: 244437.55
Monotonicity: Not monotonic
[Histogram with fixed size bins (bins=50); image not preserved]

Common values

Value                 Count      Frequency (%)
360                   552966     3.7%
420                   545596     3.7%
300                   533997     3.6%
480                   520228     3.5%
540                   483797     3.3%
240                   470535     3.2%
600                   440683     3.0%
660                   396544     2.7%
720                   354681     2.4%
180                   351204     2.4%
Other values (6584)   10126384   68.5%

Minimum 10 values

Value                 Count      Frequency (%)
0                     34185      0.2%
1                     1432       < 0.1%
2                     2867       < 0.1%
3                     2212       < 0.1%
4                     2121       < 0.1%
5                     1549       < 0.1%
6                     1238       < 0.1%
7                     1165       < 0.1%
8                     1099       < 0.1%
9                     1041       < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
10800                 1          < 0.1%
10740                 1          < 0.1%
10680                 3          < 0.1%
10620                 2          < 0.1%
10560                 1          < 0.1%
10380                 3          < 0.1%
10320                 3          < 0.1%
10265                 1          < 0.1%
10260                 2          < 0.1%
10200                 1          < 0.1%

trip_distance
Real number (ℝ)

High correlation 

Distinct: 4368
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7709757
Minimum: 0
Maximum: 100
Zeros: 83376
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 112.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.5
Q1: 1
Median: 1.7
Q3: 3.06
95-th percentile: 9.4
Maximum: 100
Range: 100
Interquartile range (IQR): 2.06

Descriptive statistics

Standard deviation: 3.3059235
Coefficient of variation (CV): 1.193054
Kurtosis: 29.057078
Mean: 2.7709757
Median Absolute Deviation (MAD): 0.83
Skewness: 3.8365388
Sum: 40945641
Variance: 10.92913
Monotonicity: Not monotonic
[Histogram with fixed size bins (bins=50); image not preserved]

Common values

Value                 Count      Frequency (%)
0.9                   351766     2.4%
1                     351498     2.4%
0.8                   345082     2.3%
1.1                   337293     2.3%
1.2                   322671     2.2%
0.7                   321976     2.2%
1.3                   304896     2.1%
1.4                   288759     2.0%
0.6                   280786     1.9%
1.5                   271872     1.8%
Other values (4358)   11600016   78.5%

Minimum 10 values

Value                 Count      Frequency (%)
0                     83376      0.6%
0.01                  2968       < 0.1%
0.02                  2479       < 0.1%
0.03                  2508       < 0.1%
0.04                  2979       < 0.1%
0.05                  3643       < 0.1%
0.06                  4085       < 0.1%
0.07                  4447       < 0.1%
0.08                  4814       < 0.1%
0.09                  4739       < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
100                   5          < 0.1%
99.9                  1          < 0.1%
99.8                  1          < 0.1%
99.6                  1          < 0.1%
99.3                  1          < 0.1%
99.2                  1          < 0.1%
99                    1          < 0.1%
98.9                  2          < 0.1%
98.8                  1          < 0.1%
98.7                  2          < 0.1%

pickup_longitude
Real number (ℝ)

High correlation  Zeros 

Distinct: 40442
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -72.63634
Minimum: -2771.2854
Maximum: 112.40418
Zeros: 267494
Zeros (%): 1.8%
Negative: 14509074
Negative (%): 98.2%
Memory size: 112.7 MiB

Quantile statistics

Minimum: -2771.2854
5-th percentile: -74.006592
Q1: -73.991882
Median: -73.981659
Q3: -73.966843
95-th percentile: -73.873047
Maximum: 112.40418
Range: 2883.6896
Interquartile range (IQR): 0.025039

Descriptive statistics

Standard deviation: 10.138193
Coefficient of variation (CV): -0.13957466
Kurtosis: 1821.8679
Mean: -72.63634
Median Absolute Deviation (MAD): 0.011879
Skewness: -2.3528999
Sum: -1.0733192 × 10^9
Variance: 102.78295
Monotonicity: Not monotonic
[Histogram with fixed size bins (bins=50); image not preserved]

Common values

Value                 Count      Frequency (%)
0                     267494     1.8%
-73.982079            5506       < 0.1%
-73.982239            5392       < 0.1%
-73.982224            5371       < 0.1%
-73.982208            5294       < 0.1%
-73.982124            5118       < 0.1%
-73.982262            5073       < 0.1%
-73.982368            5001       < 0.1%
-73.982094            4998       < 0.1%
-73.9823              4994       < 0.1%
Other values (40432)  14462374   97.9%

Minimum 10 values

Value                 Count      Frequency (%)
-2771.2854            1          < 0.1%
-2259.9832            1          < 0.1%
-2249.2717            1          < 0.1%
-2217.7666            1          < 0.1%
-2211.8577            1          < 0.1%
-2134.6482            1          < 0.1%
-2113.6499            1          < 0.1%
-2104.8601            1          < 0.1%
-2014.3392            1          < 0.1%
-2001.194             1          < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
112.40418             1          < 0.1%
80.842125             1          < 0.1%
73.988716             1          < 0.1%
73.937798             1          < 0.1%
73.93779              1          < 0.1%
73.937775             1          < 0.1%
73.937759             1          < 0.1%
73.937752             1          < 0.1%
38.026054             1          < 0.1%
11.047888             1          < 0.1%

pickup_latitude
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 64511
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.014399
Minimum: -3547.9207
Maximum: 3310.3645
Zeros: 265104
Zeros (%): 1.8%
Negative: 130
Negative (%): < 0.1%
Memory size: 112.7 MiB

Quantile statistics

Minimum: -3547.9207
5-th percentile: 40.70377
Q1: 40.735512
Median: 40.753147
Q3: 40.767288
95-th percentile: 40.787636
Maximum: 3310.3645
Range: 6858.2852
Interquartile range (IQR): 0.031776

Descriptive statistics

Standard deviation: 7.7899041
Coefficient of variation (CV): 0.19467752
Kurtosis: 75801.186
Mean: 40.014399
Median Absolute Deviation (MAD): 0.01564
Skewness: -127.80996
Sum: 5.9127737 × 10^8
Variance: 60.682605
Monotonicity: Not monotonic
[Histogram with fixed size bins (bins=50); image not preserved]

Common values

Value                 Count      Frequency (%)
0                     265104     1.8%
40.758148             2586       < 0.1%
40.758011             2584       < 0.1%
40.774109             2516       < 0.1%
40.774094             2477       < 0.1%
40.774101             2464       < 0.1%
40.774117             2464       < 0.1%
40.759426             2461       < 0.1%
40.774132             2419       < 0.1%
40.774078             2400       < 0.1%
Other values (64501)  14489140   98.1%

Minimum 10 values

Value                 Count      Frequency (%)
-3547.9207            1          < 0.1%
-3447.9197            1          < 0.1%
-3447.9177            1          < 0.1%
-3447.9167            1          < 0.1%
-3181.0781            1          < 0.1%
-3127.637             1          < 0.1%
-3115.2737            1          < 0.1%
-3114.3157            1          < 0.1%
-3114.2949            1          < 0.1%
-3114.2922            3          < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
3310.3645             1          < 0.1%
3210.3625             1          < 0.1%
3210.3447             1          < 0.1%
3210.344              1          < 0.1%
3124.198              1          < 0.1%
2317.6506             1          < 0.1%
2313.0457             1          < 0.1%
2210.175              1          < 0.1%
2210.1724             1          < 0.1%
2152.334              1          < 0.1%

dropoff_longitude
Real number (ℝ)

Zeros 

Distinct: 56249
Distinct (%): 0.4%
Missing: 86
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: -72.594427
Minimum: -2350.9556
Maximum: 2228.7375
Zeros: 275657
Zeros (%): 1.9%
Negative: 14500825
Negative (%): 98.1%
Memory size: 112.7 MiB

Quantile statistics

Minimum: -2350.9556
5-th percentile: -74.006927
Q1: -73.991211
Median: -73.980125
Q3: -73.963898
95-th percentile: -73.900284
Maximum: 2228.7375
Range: 4579.6931
Interquartile range (IQR): 0.027313

Descriptive statistics

Standard deviation: 10.288603
Coefficient of variation (CV): -0.14172718
Kurtosis: 1761.6191
Mean: -72.594427
Median Absolute Deviation (MAD): 0.012726
Skewness: 0.68020638
Sum: -1.0726937 × 10^9
Variance: 105.85536
Monotonicity: Not monotonic
[Histogram with fixed size bins (bins=50); image not preserved]

Common values

Value                 Count      Frequency (%)
0                     275657     1.9%
-73.982239            4366       < 0.1%
-73.982208            4264       < 0.1%
-73.982079            4257       < 0.1%
-73.982224            4222       < 0.1%
-73.982368            4160       < 0.1%
-73.982262            4055       < 0.1%
-73.981956            4044       < 0.1%
-73.982285            3993       < 0.1%
-73.98233             3980       < 0.1%
Other values (56239)  14463531   97.9%

Minimum 10 values

Value                 Count      Frequency (%)
-2350.9556            1          < 0.1%
-2343.4888            1          < 0.1%
-2331.8333            1          < 0.1%
-2236.5166            1          < 0.1%
-2157                 1          < 0.1%
-2148.71              1          < 0.1%
-2032.405             1          < 0.1%
-2006.7937            1          < 0.1%
-1991.0743            1          < 0.1%
-1952.6053            1          < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
2228.7375             1          < 0.1%
2084.3                1          < 0.1%
1347.4446             1          < 0.1%
111.49388             1          < 0.1%
84.315735             1          < 0.1%
84.30883              1          < 0.1%
80.842125             1          < 0.1%
73.937798             1          < 0.1%
73.937759             3          < 0.1%
73.937752             1          < 0.1%

dropoff_latitude
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 88766
Distinct (%): 0.6%
Missing: 86
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.992189
Minimum: -3547.9207
Maximum: 3477.1055
Zeros: 273357
Zeros (%): 1.8%
Negative: 108
Negative (%): < 0.1%
Memory size: 112.7 MiB

Quantile statistics

Minimum: -3547.9207
5-th percentile: 40.689629
Q1: 40.734684
Median: 40.75362
Q3: 40.768192
95-th percentile: 40.79287
Maximum: 3477.1055
Range: 7025.0262
Interquartile range (IQR): 0.033508

Descriptive statistics

Standard deviation: 7.5370668
Coefficient of variation (CV): 0.18846347
Kurtosis: 78387.389
Mean: 39.992189
Median Absolute Deviation (MAD): 0.016384
Skewness: -141.44463
Sum: 5.9094575 × 10^8
Variance: 56.807376
Monotonicity: Not monotonic
[Histogram with fixed size bins (bins=50); image not preserved]

Common values

Value                 Count      Frequency (%)
0                     273357     1.8%
40.758148             2630       < 0.1%
40.759426             2522       < 0.1%
40.758011             2474       < 0.1%
40.744915             1855       < 0.1%
40.758453             1767       < 0.1%
40.750149             1704       < 0.1%
40.750172             1671       < 0.1%
40.750118             1616       < 0.1%
40.750156             1606       < 0.1%
Other values (88756)  14485327   98.0%

Minimum 10 values

Value                 Count      Frequency (%)
-3547.9207            1          < 0.1%
-3547.8953            1          < 0.1%
-3481.1426            1          < 0.1%
-3481.1343            1          < 0.1%
-3347.9331            1          < 0.1%
-3255.5735            1          < 0.1%
-3117.5684            1          < 0.1%
-3117.5374            1          < 0.1%
-3117.5229            1          < 0.1%
-3117.4885            1          < 0.1%

Maximum 10 values

Value                 Count      Frequency (%)
3477.1055             1          < 0.1%
3210.3679             1          < 0.1%
3210.3381             1          < 0.1%
3177.1118             1          < 0.1%
1727.0167             1          < 0.1%
1705.8805             1          < 0.1%
1651.5535             1          < 0.1%
1442.6033             1          < 0.1%
1421.3934             1          < 0.1%
1330.6375             1          < 0.1%

Interactions

[64 interaction plots, one per ordered pair of the 8 numeric variables (8 × 8); images not preserved]

Correlations

[Correlation heatmap; image not preserved]

                    dropoff_latitude  dropoff_longitude  passenger_count  pickup_latitude  pickup_longitude  rate_code  store_and_fwd_flag  trip_distance  trip_time_in_secs  vendor_id
dropoff_latitude    1.000             0.476              -0.005           0.506            0.229             -0.088     1.000               -0.063         -0.104             0.002
dropoff_longitude   0.476             1.000              -0.007           0.212            0.408             0.054      0.002               0.122          0.051              0.002
passenger_count     -0.005            -0.007             1.000            -0.009           -0.011            0.004      1.000               0.029          0.024              0.000
pickup_latitude     0.506             0.212              -0.009           1.000            0.521             -0.119     1.000               -0.072         -0.078             0.002
pickup_longitude    0.229             0.408              -0.011           0.521            1.000             0.113      0.000               0.043          0.017              0.002
rate_code           -0.088            0.054              0.004            -0.119           0.113             1.000      0.000               0.152          0.155              0.001
store_and_fwd_flag  1.000             0.002              1.000            1.000            0.000             0.000      1.000               0.021          0.024              1.000
trip_distance       -0.063            0.122              0.029            -0.072           0.043             0.152      0.021               1.000          0.844              0.017
trip_time_in_secs   -0.104            0.051              0.024            -0.078           0.017             0.155      0.024               0.844          1.000              0.009
vendor_id           0.002             0.002              0.000            0.002            0.002             0.001      1.000               0.017          0.009              1.000
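
The report does not say which coefficient sits behind each cell of this matrix. For pairs of numeric columns, a rank-based measure such as Spearman's rho is a common choice, and the sketch below shows one way to compute it, applied to the trip_distance and trip_time_in_secs values of the ten first-row samples. Treating the matrix as Spearman is an assumption, and the helper names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# trip_distance and trip_time_in_secs from the ten first-row samples.
distance = [1.0, 1.5, 1.1, 0.7, 2.1, 1.7, 0.8, 10.7, 0.8, 2.5]
seconds = [382, 259, 282, 244, 560, 648, 418, 1898, 299, 957]
rho = spearman(distance, seconds)
print(round(rho, 3))
```

Ten rows are far too few to reproduce the 0.844 in the matrix; the point is only to show the mechanics of a rank correlation on this pair of columns.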

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
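
Nullity correlation can be read as the ordinary Pearson correlation between the "is missing" indicators of two columns. A minimal sketch under that reading follows. `nullity_correlation` is a hypothetical helper, and the toy columns are invented to mimic the case where both dropoff coordinates are missing on the same rows, which the matching count of 86 missing values makes plausible but does not prove.

```python
def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Returns None when either indicator is constant (no missing values, or
    all missing), because the correlation is undefined in that case.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (var_a * var_b) ** 0.5


# Toy columns (not the real data): both coordinates missing on the same row.
dropoff_longitude = [-73.98, None, -73.99, -74.00]
dropoff_latitude = [40.75, None, 40.74, 40.73]
passenger_count = [1, 2, 1, 3]

same_rows = nullity_correlation(dropoff_longitude, dropoff_latitude)
no_missing = nullity_correlation(dropoff_longitude, passenger_count)
print(same_rows, no_missing)
```

Columns without any missing values have no defined nullity correlation, which is why fully populated variables usually drop out of such a heatmap.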

Sample

First rows

index     medallion                         hack_license                      vendor_id  rate_code  store_and_fwd_flag  pickup_datetime      dropoff_datetime     passenger_count  trip_time_in_secs  trip_distance  pickup_longitude  pickup_latitude  dropoff_longitude  dropoff_latitude
0         89D227B655E5C82AECF13C3F540D4CF4  BA96DE419E711691B9445D6A6307C170  CMT        1          N                   2013-01-01 15:11:48  2013-01-01 15:18:10  4                382                1.0            -73.978165        40.757977        -73.989838         40.751171
1         0BD7C8F5BA12B88E0B67BED28BEA73D8  9FD8F69F0804BDB5549F40E9DA1BE472  CMT        1          N                   2013-01-06 00:18:35  2013-01-06 00:22:54  1                259                1.5            -74.006683        40.731781        -73.994499         40.750660
2         0BD7C8F5BA12B88E0B67BED28BEA73D8  9FD8F69F0804BDB5549F40E9DA1BE472  CMT        1          N                   2013-01-05 18:49:41  2013-01-05 18:54:23  1                282                1.1            -74.004707        40.737770        -74.009834         40.726002
3         DFD2202EE08F7A8DC9A57B02ACB81FE2  51EE87E3205C985EF8431D850C786310  CMT        1          N                   2013-01-07 23:54:15  2013-01-07 23:58:20  2                244                0.7            -73.974602        40.759945        -73.984734         40.759388
4         DFD2202EE08F7A8DC9A57B02ACB81FE2  51EE87E3205C985EF8431D850C786310  CMT        1          N                   2013-01-07 23:25:03  2013-01-07 23:34:24  1                560                2.1            -73.976250        40.748528        -74.002586         40.747868
5         20D9ECB2CA0767CF7A01564DF2844A3E  598CCE5B9C1918568DEE71F43CF26CD2  CMT        1          N                   2013-01-07 15:27:48  2013-01-07 15:38:37  1                648                1.7            -73.966743        40.764252        -73.983322         40.743763
6         496644932DF3932605C22C7926FF0FE0  513189AD756FF14FE670D10B92FAF04C  CMT        1          N                   2013-01-08 11:01:15  2013-01-08 11:08:14  1                418                0.8            -73.995804        40.743977        -74.007416         40.744343
7         0B57B9633A2FECD3D3B1944AFC7471CF  CCD4367B417ED6634D986F573A552A62  CMT        1          N                   2013-01-07 12:39:18  2013-01-07 13:10:56  3                1898               10.7           -73.989937        40.756775        -73.865250         40.770630
8         2C0E91FF20A856C891483ED63589F982  1DA2F6543A62B8ED934771661A9D2FA0  CMT        1          N                   2013-01-07 18:15:47  2013-01-07 18:20:47  1                299                0.8            -73.980072        40.743137        -73.982712         40.735336
9         2D4B95E2FA7B2E85118EC5CA4570FA58  CD2F522EEE1FF5F5A8D8B679E23576B3  CMT        1          N                   2013-01-07 15:33:28  2013-01-07 15:49:26  2                957                2.5            -73.977936        40.786983        -73.952919         40.806370

Last rows

index     medallion                         hack_license                      vendor_id  rate_code  store_and_fwd_flag  pickup_datetime      dropoff_datetime     passenger_count  trip_time_in_secs  trip_distance  pickup_longitude  pickup_latitude  dropoff_longitude  dropoff_latitude
14776605  A8262FA0AFCB6C7229F6888EAFBDE076  1BDF89260FEF1AE6FDDE839A0278D31D  CMT        2          N                   2013-01-07 07:29:06  2013-01-07 08:19:39  1                3032               21.3           -73.776794        40.645775        -74.010933         40.704960
14776606  A8262FA0AFCB6C7229F6888EAFBDE076  1BDF89260FEF1AE6FDDE839A0278D31D  CMT        1          N                   2013-01-07 14:30:23  2013-01-07 14:42:14  1                711                1.4            -73.946114        40.801075        -73.966530         40.805023
14776607  F33EF464441839C6F0DABAABBC93B45D  313F66DD09C308EADA3B307F6B8CF7A9  CMT        1          N                   2013-01-10 10:56:47  2013-01-10 11:05:52  1                545                1.4            -73.975410        40.759106        -73.961830         40.776527
14776608  56CE01E7DBE0E6449FA1758F082D8884  4C6FE2FCFED26629D515D291EC1516A0  CMT        1          N                   2013-01-10 14:50:01  2013-01-10 15:19:10  1                1748               4.0            -73.957344        40.785732        -73.994942         40.742931
14776609  32201027CDC62D654DC3AD9747A07C96  B8DDB9F8143017E22104050B26C2A65D  CMT        1          N                   2013-01-05 08:58:18  2013-01-05 09:05:56  1                458                3.2            -73.998901        40.734509        -73.966820         40.770138
14776610  B33E71CD9E8FE1BE3B70FEB6E807DD15  BAF57796E45D921BB23217E17A372FF6  CMT        1          N                   2013-01-06 04:58:23  2013-01-06 05:11:24  1                781                3.3            -73.989029        40.759327        -73.953743         40.770672
14776611  ED160B76D5349C8AC1ECF22CD4B8D538  3B93F6DA5DEBDE9560993FA624C4FF76  CMT        1          N                   2013-01-08 14:42:04  2013-01-08 14:50:27  1                503                1.0            -73.993042        40.733990        -73.982483         40.724823
14776612  D83F9AC0E33F6F19869C243BE6AB6FE5  85A55B6772275374EF90AC9457DC1F83  CMT        1          N                   2013-01-10 13:29:23  2013-01-10 13:34:45  1                321                0.9            -73.979553        40.785011        -73.968262         40.788158
14776613  04E59442A7DDBCE515E33CD355D866E7  7913172189931A1A1632562B10AB53C4  CMT        1          N                   2013-01-06 16:30:15  2013-01-06 16:42:26  1                730                1.3            -73.968002        40.762161        -73.985992         40.770542
14776614  D30BED60331C79E3F7ACD05B325ED42F  B5E1D2461A5BCC8819188DACEC17CD69  CMT        1          N                   2013-01-05 20:38:46  2013-01-05 20:43:06  1                260                0.8            -73.982224        40.766670        -73.989212         40.773636
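
The sample rows allow a quick consistency check between the two timestamps and trip_time_in_secs. The sketch below parses the timestamps with the standard library and reports how far the elapsed time deviates from the recorded duration. `duration_mismatch` is a hypothetical helper, and the timestamp format is inferred from the sample rather than documented in the report.

```python
from datetime import datetime

# Timestamp layout as it appears in the sample rows (inferred).
FMT = "%Y-%m-%d %H:%M:%S"


def duration_mismatch(pickup, dropoff, trip_time_in_secs):
    """Seconds by which the timestamp difference exceeds the recorded trip time."""
    elapsed = datetime.strptime(dropoff, FMT) - datetime.strptime(pickup, FMT)
    return int(elapsed.total_seconds()) - trip_time_in_secs


# Rows 0 and 3 of the first-rows sample.
row0 = duration_mismatch("2013-01-01 15:11:48", "2013-01-01 15:18:10", 382)
row3 = duration_mismatch("2013-01-07 23:54:15", "2013-01-07 23:58:20", 244)
print(row0, row3)
```

Row 0 agrees exactly and row 3 is off by one second, so a small tolerance is advisable if this check is used to flag inconsistent records.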